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Abstract 

We present a deterministic algorithm that computes the zeta function of a nonsupersin- 
gular elliptic curve E over a finite field with p n elements in time quasi-quadratic in n. An 
older algorithm having the same time complexity uses the canonical lift of E, whereas our al- 
gorithm uses rigid cohomology combined with a deformation approach. An implementation 
in small odd characteristic turns out to give very good results. 

1 Introduction 

Elliptic curves are a central research object in mathematics, not only centuries and decades ago, 
but even today with a lot of important unsolved problems concerning such curves. The most 
notorious example is of course the conjecture of Birch and Swinnerton-Dyer [T], a solution of 
which is worth a million dollar [18j . In recent times elliptic curves over finite fields have drawn 
the attention of cryptographers, as Koblitz [15] and Miller [19] suggested to exploit the group 
structure on such curves for creating a trapdoor one way function. The motivation for this 
proposal is that computing discrete logarithms is considered to be very hard for most elliptic 
curves, while computing the group operation can be done very fast. A very broad exposition can 
be found in the book [2]. Such one way functions can be used in many cryptographic protocols, as 
for example Diffie-Hcllman key exchange [5 or ElGamal encryption [5] . An important parameter 
needed for estimating the security level of these applications is the order of the group involved, 
in this case hence the order of the elliptic curve. We will further on give a brief overview of the 
large amount of work that has been done on this subject. For now, we content ourselves with 
noting that counting the number of points on curves over a field of characteristic 2 and of sizes 
suitable for cryptography can be accomplished in time (far) less than a second. 

1.1 The zeta function and supersingular curves 

Let E be an elliptic curve defined over the finite field ¥ q with q elements, then we can define its 
zeta function as follows: 

Z(T) := cxp \Y, I q V I , 
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where #E(F q k) is the number of F q k -rational points on E (where E is seen as a projective curve). 
It is well known that Z(T) is actually a rational function, more precisely 



Z{T) 



qT 2 -tT+1 



t e z, \t\ < 2Vg. 



(l-T)(l-qTY 



A proof of this theorem of Hasse and Weil can be found for example in [23] §V.2]. The integer 
t in the zeta function is called the trace of Frobenius, for reasons that will become clear further 
on in this paper. It is not hard to see that the number #E(W q ) of F 9 -rational points on E is 
precisely q + 1 — t. We can conclude that counting the number of points on E is equivalent to 
computing its zeta function or its trace t. 

Curves for which t = mod p are called supersingular, and in [231 §V.4] an easy criterion is 
given for deciding whether a given curve is supersingular. There are only a few possible values for 
the trace of a supersingular curve, a list with a proof can be found for example in |27j . Note that 
if we are given the zeta function of E over ¥ q , it is easy to find the zeta function over extension 
fields of F q . Indeed, if we denote with Zk(T) the numerator of the zeta function of E over F q k, 
then Zk(T) equals the following resultant: 



1.2 Point counting algorithms 

In the following overview we limit our exposition to elliptic curves over finite fields with p n 
elements, where p is a small prime number (e.g. p < 7) and n is the relevant parameter. For 
the complexity estimates — which are always meant bitwise — we use the classical Big-Oh 
notation O, together with the Soft-Oh notation O as defined in [26, Definition 25.8], which 
ignores logarithmic factors. Using the above remark we also ignore the dependency on p of the 
algorithms, being irrelevant for very small primes. 

A very nice and complete overview of the history of elliptic curve point counting can be found 
in chapter 17 of the book [2] by Cohen, Frey e.a. The first general algorithm is due to Schoof, 
and improvements by Elkies and Atkin have led to the well known SEA algorithm, which runs in 
time 0(n 4 ) and requires 0{n 2 ) memory. It is often called '€-adic', because it works by computing 
the trace of Frobenius modulo prime numbers £ p. Having done this for enough small primes 
t, this allows one to recover the trace. 

A different approach was considered by Satoh, who found that p-adic methods might be 
much more efficient for small primes p than the technique of Schoof. Satoh's method is based on 
the canonical lift £ of the curve E. Let Q 9 be the unramificd degree n extension of the p-adic 
field Q p , then £ is defined to be the unique (up to isomorphism) lift of E to Q q which has an 
endomorphism ring that is isomorphic to the one of E, with the isomorphism given by reduction 
modulo p. The idea is then to approximate the j-invariant J of this canonical lift modulo an 
appropriate power of p and afterwards analyzing the action of the gth power Frobenius on the 
lift in order to compute its trace. In later optimizations of the algorithm two main steps arose. 
First we have to solve an equation ip(J, J a ) — over <Q q , where J is congruent modulo p to the 
j-invariant of E, and a : Q q — > Q q is the Frobenius automorphism. A second step consists of 
computing the norm A/q /q of an element of Q 9 . Satoh's original algorithm [20| worked in time 
0(n 3 ) and required C(n 3 ) memory space. After a lot of improvements by Vercauteren [25], the 
AGM of Mestre [17], Satoh, Skjernaa and Taguchi (sst) |21) . and others, a computation time of 
0(n 2 - 5 ) and space 0(n 2 ) was achieved. The fastest method however, working for all finite fields 
of small characteristic, is the patented algorithm of Harley, as described in his e-mail [10]. It 
requires time 0(n 2 ) and memory C(n 2 ), and does not need any precomputations, in contrast to 



Z k (T)^Res x (Z 1 (X),X k -T). 



(1) 
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SST. The basic improvements of Harley are fast ways to compute a good representation of Q„, 
to solve equations of the kind aX a + bX + c = over Q q , and to compute the norm A/q /q of 
an element of Q g . A complete description can be found in section 3.10 of [53] . 

1.3 An 0(n 2 ), 0(n 2 ) algorithm using a rigid lift 

In this paper we describe a new algorithm, that has the same complexity as Harley's result, 
but is based on a different approach. In j!3| Kedlaya gave an algorithm to compute the zeta 
function of a hyperelliptic curve of genus g in odd characteristic in time 0{g A n?) and space 
0(g 3 n 3 ). It uses not the canonical lift (for genus one curves), but a rigid lift, which is trivial 
to compute. If we take the de Rham cohomology of this lifted curve, a Lefschetz fixed point 
theorem of Monsky and Washnitzer tells us that the characteristic polynomial of the Frobenius 
operator on this cohomology yields the zeta function of the curve. Three points are crucial. First, 
if the lift is well-chosen, we can effectively compute in this Monsky- Washnitzer cohomology due 
to it being isomorphic to the de Rham cohomology of the algebraic lift. Second, by cutting 
out Weierstrass points, the action of Frobenius is readily computable. And third, factoring 
the qth power Frobenius in repeated applications of the pth power Frobenius makes sure that 
the appearing power series converge good enough. Later on Denef and Vercauteren extended 
Kedlaya's method to the technically more difficult case of characteristic 2 in [4]. 

In [16j , Lauder used deformation in order to compute the zeta function of higher dimensional 
varieties. This works by putting the variety in a well-chosen one parameter family, say with for- 
mal parameter T, and computing the general matrix -F(r) of the pth power Frobenius. As shown 
by Dwork in [6] such a matrix satisfies a differential equation, the Picard-Fuchs equation of the 
deformation, and this equation allows fast recovery of F(T) modulo a certain power of T. In a 
next step the matrix F(T) is specialized to F(j) for some 7 € Q q , and computing the matrix of 
the qth power Frobenius yields then the zeta function. In [11] and [12] we followed a suggestion 
of Lauder to try to combine such a deformation with Kedlaya's and Denef and Vercauteren's 
algorithm, and this resulted in an 0(n 2 ' 667 ) algorithm for hyperelliptic curves in certain families. 
The most time consuming step in these algorithms is the computation of the 'norm' of the matrix 
F(j), i.e. the ordered product of its conjugates. For elliptic curves we show in this paper that all 
curves can be put in a good family, and that we can reduce the problem to computing the norm 
of just one element of Q g . Using Harley's fast norm computation algorithm this gives then the 
aforementioned complexities. We note that Harley's other basic improvements are also used in 
our algorithm. 

We briefly sketch the structure of this paper. In section [2] we describe how to put a general 
curve in a good linear family defined over the prime field, and in the next two sections we repeat 
briefly how the theory of [11] and [12] allows us to compute the matrix of the pth power Frobenius 
for curves in such a family. In addition we explain how to recover an integral matrix of Frobenius, 
which is not guaranteed by the original algorithms of [11] and [12]. In the fifth section is shown 
how to compute the trace of Frobenius from this matrix, and in the last section we present an 
overview of the algorithm and some results obtained with an implementation of (a variant of) 
the algorithm. 

The author wants to thank Jan Denef for his help on the problem of finding an integral matrix 
of Frobenius in characteristic 2 and Denef and Wouter Castryck for their comments on an early 
version of this paper. 
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2 The curve placed in a one parameter family 



Let E be a nonsupersingular elliptic curve over a finite field ¥ q , given by its Weierstrass equation. 
We will show in this section how to reduce efficiently the equation of E to another equation over 
W q , defining E', such that this last one can be tackled directly using the deformation technique of 
sections [3] and ID The resulting elliptic curve E' will be isomorphic to the original curve or to its 
quadratic twist, which we denote by Twist (E). It is well known that the trace of Frobenius t of 
E equals minus the trace of Frobenius of Twist (E), and hence it suffices to work with E 1 . Note 
that it will be clear in each case which of the two isomorphisms E' = E or E' = Twist (£7) holds. 
We have to stress that these results are certainly not new, but we did not find a good reference, 
and the explicit way to find the curve E' is an important part of a concrete implementation of 
the algorithm. 

2.1 Odd characteristic 

Let p be an odd prime and ¥ q a finite field of order q — p n . We suppose that the elliptic curve 
E over ¥ q is given by 

Y 2 = X 3 + aX 2 + bX + c, a,6,ceF 9 . (2) 

If p 7^ 3 the translation X i— > X — a/3 removes the term with X 2 in ([2|), so we can suppose in 
this case that a = 0. If c = this can be written as Y 2 = X 3 + jX with 7 := b, a form suitable 
for section [31 so we may assume that c ^ 0. Similarly we can assume that b 7^ 0. The notation 
(F 9 ) 2 will be used for the set of squares of ¥ q . 

Proposition 1 Let 7 := b 3 jc 2 and let E' be the elliptic curve over ¥ q defined by Y 2 — X 3 + 
7X + 7. If b/c e (¥ q ) 2 we have that E' = E (over ¥ q ), and otherwise E' ^ Twist(E). 

Proof. Let d be a nonsquare in ¥ q if b/c £ (F 9 ) 2 , and d ;= 1 otherwise. Then there exists 
A G ¥ q such that A 2 = A, and the change of variables Y 1— > \~ 3 Y, X 1— » \~ 2 X transforms 
Y 2 = X 3 + bd 2 X + cd 3 into Y 2 — X 3 + (b 3 /c 2 )X + b 3 /c 2 . Is d is a nonsquare the equation 
Y 2 = X 3 + bd 2 X + cd 3 gives precisely the quadratic twist of E. ■ 

Now we take p = 3. If a = in we can again use proposition [TJ and if a 7^ the translation 
I hI-^ removes the term with X in ((2|). So we can suppose for the next proposition that 
b = and a ^ 0. 

Proposition 2 Let p = 3 and E be given by Y 2 = X 3 + aX 2 + c. Dehne 7 := c/a 3 and the 
elliptic curve E' with equation Y 2 = X 3 + X 2 + 7. If a £ (¥ q ) 2 we have that E' = E, and 
otherwise E' = Twist(E). 

PROOF. If we 'twist' E using a -1 , we find Y 2 = X 3 + X 2 + c/a 3 , and now we can finish as in 
the proof of proposition [T] ■ 

We can conclude that given any elliptic curve in odd characteristic, we can always find 7 € ¥ q 
and some polynomial Q(X, V) over ¥ p such that the following holds: Q(X, T) is monic of degree 
3 in X and linear in V, and it suffices to compute the zeta function of Y 2 = Q(X, 7). In addition, 
this can be done very fast. Indeed, the complexity is dominated by verifying whether b/c (or a) 
is a square in ¥ q , and as x £ (¥ q ) 2 is equivalent to x^ 1 ^ 2 = 1, this can certainly be done in 
time C(n 2 ) and space 0(n). 

In section [3] we will need that Y 2 = Q(X,0) defines an elliptic curve over F p , but this can 
always be achieved by the translation rnTla for some a S ¥ p . It is interesting to make the 
degree in T of the resultant Resx(Q(^, L); -^Q{X, V)) as small as possible (where we interpret 

X A11 such curves are in fact supersingular because their j'-invariant is zero. 
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Q(X, T) e Z p [X, r] for the moment). In proposition Q] this will be 3 and in proposition [5] we find 
degree 2. If 7 G (F g ) 2 in proposition [TJ we can twist over 1/1/7 an d find Y 2 = X 3 + X + 7' for 
some 7' € F g , which also gives a second degree resultant. Although this requires the computation 
of a square root in ¥ q , it might still be advantageous in the end. 

2.2 Characteristic 2 

We now take q = 2 n and E a nonsupersingular curve over ¥ q given by 

Y 2 +a(X + b)Y = X 3 + cX 2 +dX + e with a, 6, c, d, e G ¥ q . 

The fact that -E 1 is not supersingular is easily seen to be equivalent to a / 0. The translation 
X 1 ► X ~\~ b shows that we can suppose that 6 = 0, and with 6 = the translation Y 1— » Y + *J~e 
gives that we can take e = as well. Finally Y h-> a 3 Y and X 1— > a 2 X gives the form 

Y 2 + XY = X(X 2 + AX + B), A,Be¥ q 

as equation for the curve E. Hilbert's Satz 90 shows that a 2 + a + A = has a solution a S F 9 
if and only if Trp /f, (A) = 0. If this trace equals 1 we can take a in a degree 2 extension 
of ¥ q . The change of variables Y 1— > Y + aX yields then the elliptic curve £" with equation 
Y 2 + XY = X(X 2 + B). The conclusion is that E' = E over F 9 if Tr Wq /w 2 (A) = 0, and otherwise 
we have E 1 = Twist As we did not find a relevant reference we prove the following lemma, 
which implies that the sum of the traces of Frobenius of E and E' is zero. 

Lemma 3 The equations Y 2 + XY = X(X 2 + AX + B) and Y 2 + XY = X(X 2 + B), where 
A, B S ¥ q and Tr$ (A) = 1, have together precisely 2q afhne solutions. 

Proof. We show that for every x € ¥ q one of the equations has two solutions, and the other 
has none. If x = both have one solution. Choose x E ¥ q . Replacing Y by xY gives as equation 
Y 2 + Y = x + Bjx + A, and this has (two) solutions if and only if Tr^ q /f 2 (x + B/x + A) = 0. 
The linearity of the trace concludes the proof. ■ 

Analogously we can find for supersingular curves an equation Y 2 + 7Y = X 3 + X 2 with 
similar properties as above. We do not work this out, as we do not need it anyway. 

Define H(X) := X, Qf(X,T) := X 2 + T and 7 := B, then we have proven that it suffices to 
compute the zeta function of the elliptic curve with equation 

Y 2 +H(X)Y = H(X)Q f (X,j), 

where H(X),Q f (X, T) € ¥ 2 [X, T] and 7 E ¥ q . In order to get an elliptic curve for Y 2 +H(X)Y = 
H(X)Q f (X,0) as well, we only have to translate r i-> T + 1, as Y 2 + XY = X(X 2 + 1) does 
define an elliptic curve. Again, the above transformations can be done very fast in practice. The 
most time consuming step is computing Trf q ^f 2 (A), which can certainly be done in time 0(n 2 ). 
The memory requirements are only O(n). 

3 pth power Frobenius in odd characteristic 

Now that we have put our elliptic curve — up to a twist — in a linear family, we will show how 
to compute the matrix of the pth power Frobenius on its Monsky-Washnitzer cohomology. This 
cohomology was first considered by Kedlaya in [13j in an algorithm to count the number of points 
on hyperelliptic curves in odd characteristic. We have worked out this deformation approach in 
great detail in [11] , and we will give a short summary in this section, specified to genus 1 and 
with F p as base field. More details can hence be found in 
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3.1 A sketch of the deformation theory 

We assume in this section that p is an odd prime. Let Q(X, T) e F P [X, T] be of the form explained 
at the end of section [2~Tl in particular monic of degree 3 in X and squarefree for T = 0. Suppose 
7 is the parameter such that we need the zeta function of E : Y 2 = Q(X, ; y), and let the finite 
field F q = F p n be defined as F p [x]/ (p{x) with <p(x) the minimal polynomial of 7 over F p . 

Note 4 The general case can indeed be reduced to this. Suppose that 7 € F p ™ for 1 < n < m, 
then [22] shows how to compute the minimal polynomial of 7 over F p in time 0(m 2 ), and hence 
also the field F q = ¥ p n. Having computed the zeta function over ¥ q , we can use formula (P) to 
conclude the algorithm. 

Denote with Q p the field of p-adic numbers, and with Q q the unique degree n unramificd extension 
of Q p . In fact we need a very specific representation of Q 9 , which will be explained at the end 
of section 13.21 We write Z p and Z q for the rings of integers of these fields, and the Frobenius 
automorphism, a lift of i n x p on ¥ q , is denoted by a. This morphism a is extended with 
er(T) := r p . The valuation on Q q is denoted by ord, normalized to ord(p) = 1. 

The Monsky-Washnitzer construction starts with a degree preserving lift Q(X,T) <G Z P [X, T] 
of Q(X, r). Define the resultant 

r(T) :=Res x (q(X,T); ^Q(X,r 

then we find that f (0) and f (7) (where ~ denotes the reduction modulo p) are both nonzero due 
to the fact that and 7 give (nonsingular) elliptic curves. Write r(T) — r i^ 1 an( ^ ^ ?' be 
the largest index i such that ord(rj) = 0. Then we define R(T) := J2i=o r i^ l i so that R(T) has 
a unit in Z p as leading coefficient and R(T) = r(T) mod p. Define the ring S := Q p [r, l/i?(r)]t, 
where f denotes the overconvergent completion, and the S-module 

Q p [X,Y,l/Y,T,l/R(T)V 



2 " [Y*-Q(X,L')) 
On T act two differential operators, namely d : T — > TdX : v 1— > -^dX and the connection 



V : T — > Tdr : u ^ |idT. The submodule H^ w of TdX/dT is defined as the eigenspace under 
the elliptic involution, and is a free 2-dimensional S'-module. With F p the Frobenius map on 
Hmw> we nna * ^ ne following commutative diagram: 



MW 


V 


* H MW dT 








MW 


V 





(3) 



The basis used in [TT] for H MW is the pair {dX/v^, XdX/ \/Q}, and the diagram ([3]) gives the 
differential equation 

Ajr(r) + F(r)G(r) = pi*-^ (r")F (r). (4) 

for the matrix F(T) of F p with respect to this basis. Here G(T) is the matrix of the connection 
V. Let 7 be the Teichmiiller lift of 7 in Z g , then the matrix ^(7) is precisely the matrix of the 
pth power Frobenius on the Monsky-Washnitzer cohomology as found by Kedlaya in [T5] . 



3.2 Computational issues 

In section[5]we will need the matrix ^(7) up to a certain p-adic precision N = 0(n). Following 
the algorithm in [11) with g = a = k = 1, and limiting ourselves to steps 1 to 7 of the algorithm, 
this can be achieved in time 0(n 2 ) and space C(n 2 ). 
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There are two important points to note. First, we will need that F(~/) is integral, which is a 
priori not guaranteed with our chosen basis if p = 3 (see [HI section 3.5]). Two possible solutions 
emerge. We can imitate the proofs of [TT], but now with the basis {dX/^/Q 3 , XdXj \/Q 3 }, which 
does give an integral matrix. The complexity estimates will all remain the same in this case, and 
this is the solution used in the implementation we made. Another possible work-around is to 
compute the matrix of the change between the two bases, a matrix that becomes integral after 
multiplying with p and is easily retrieved using Kedlaya's algorithm. Transforming F(<y) using 
this matrix yields then an integral version of F("f). 

Second, in the algorithm a particular representation of Q q = <Q p [x]/ip(x) is used, namely ip(x) 
has to be a Teichmiiller modulus lift of <p(x). This means simply that both polynomials are equal 
modulo p, and that (p(x) is a monic divisor of x q — x. Equivalently we can say that tp(x) is the 
minimal polynomial of the Teichmiiller lift 7 of 7. In [3J section 12.1.2] a very efficient algorithm 
for computing tp(x) is given, originally due to Harley, that computes <f(x) in time 0(n 2 ) and 
space 0{n 2 ). 

4 2nd power Frobenius in characteristic 2 

As proven in section 12.21 it suffices to consider curves given by 

Y 2 +XY = X(X 2 +7+1), 7 G F„ q = 2". 

Again we will explain briefly how to compute the matrix of the second power Frobenius on the 
Monsky-Washnitzer cohomology of the curve. It was first shown in [4j how to do this in time 
C(n 3 ) and space C(n 3 ), and in [12] we extended this result so that it worked faster and used 
less memory in one dimensional families. We will now sketch how this works, all details can be 
found in [12]. 

4.1 Computing the matrix of Frobenius 

We suppose as in the previous section that ¥ q is given by F2 [x] divided by the minimal polynomial 
of 7. Define <Q> 2 , Qq, Z 2 , Z q and er as before, and let H(X) := X and Q f (X,T) := X 2 + V + 1. 
The polynomial c(T) from [12] is just equal to 1 in our case. The resultant needed is r(T) = 
Resx(H; Qf-J^H) = T + 1, and clearly both f(0) and f(7) are nonzero in ¥ q . Moreover, defining 
R(T) as before yields R(T) = r{T). The ring S is defined by S := Q 2 [r,l/(r + l)]t and the 
S'-module T by 

_ Q 2 [x,r,i/A,r,i/(r + i)]t 

(Y 2 + XY - X(X 2 + r + 1)) ' 

Using the definitions of d, V, H^ IW as before, we find again diagram ©, with B := {YdX, XYdX} 
as basis for H^ IW . Here too we get ^(7), using the Teichmiiller modulus representation of Q q , 
with precision N = O(n). However, in order to get an integral matrix our chosen basis does not 
suffice, indeed, from the proof of proposition 11 from [T^] follows that only 2 6 • ^(7) is guaranteed 
to be integral. We will show in the next subsection how to solve this problem. The conclusion 
will be that we have to compute ^(7) modulo 2 N+1Q , and can transform it afterwards into a 
matrix of Frobenius modulo 2 N with integral coefficients. As follows from the algorithm of [12] . 
we can find this approximation of ^(7) in time 0(n 2 ) and space 0(n 2 ). 

We would like to mention that in [3] Gerkmann considered a deformation for the same family 
Y 2 + XY = X(X 2 + 7) that we used above. 
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4.2 An integral matrix of Frobenius 

We now give a sketch of how to remedy the 'integrality problem'. The eigenvalues of the <?th 
power Frobenius map are the reciprocal zeroes of the numerator of the zeta function, and hence 
integral. This implies that a Z g -submodule of H^ IW does exist that is stable under this map. 
In Proposition 5.3.1], Edixhoven showed how to find a basis for this submodule, and in 
[3] Denef and Vercauteren applied this to their characteristic 2 situation. It turns out that 
^ := { 2Y+x ' 2y+x } * s sucn an 'integral basis'. It might be possible to reconstruct the algorithm 
explained above using this basis, but here we will explain how to use the matrix of the change 
of basis in order to achieve an integral matrix of Frobenius. 

We now briefly recall the result of [3], specialized to our situation. The modules H\ and H± 
are as defined in Denef and Vercauteren's paper [4], essentially they are the modules TdX/dT and 
H-mw a bove specialized to T = 7. The curve E : Y 2 + XY — X(X 2 + 7+ l) = 0isa smooth and 
proper curve over Z g , and -E\{-Poo} is affine, with P^ the point at infinity of E. Let D = kPoc 
be a divisor on E with k > 2. We define the Z g -module L as consisting of those differentials ui 
on -E\{-Poo} satisfying the following two conditions. First, we require that div(w) + D > 0, and 
second, each term with valuation less than —1 in the local expansion of uj at P^ is integrable 
over Z g . Then the image of L in Hi is independent of the choice of the divisor D and invariant 
under the pth power Frobenius, and L generates H\. Hence we have also that L D -frf generates 
iff, and D will be a basis for L n fTf as Z g -module. 

First we need a lower bound on the valuation of the matrix of change of basis and its inverse. 
The differential forms YdX and XYdX from B have a pole of order 6 respectively 8 at the point 
Poo. If we take D = 8Poo, both forms satisfy the condition div(cj) + D > 0, and 4w for w e 6 will 
also satisfy the second condition on the integrability. Indeed, during integration only —7, . . . , —1 
can appear as denominators, and 4 divided by one of these is always integral in Z 2 . This implies 
that both 4YdX and AXYdX are in the Z g -module L, which has T> as basis, and hence the 
matrix defining the change of basis B to T> has valuation at least —2. 

For the inverse we have to reduce the basis V to B and use the lemmata 2 and 3 of [4]. As 

dX (2Y + X)dX 




2Y + X AX{X 2 +7+ I) + X 2 
an easy computation gives as lower bound for the matrix of this change of basis 



mnW min(2fc- 3 - |log 2 (fe + 1)J ); min(2fc - 3 - Llog 2 (fc + 3)J) \ + 1 > -2. 

Computing this last matrix, denoted with B, modulo 2 M with M = 0(n) is easy using the 
reduction formulae in [4], but this would require time C(n 3 ). We can see however that we do not 
need B modulo such high power of 2. Indeed, let B 1 be any invertible matrix over <Q q such that 
F' := (B 1 ^ 1 ) 17 F{ r ))B' is integral, then B' gives the change to an unknown but irrelevant basis, 
and the resulting integral matrix F 1 is still a matrix of Frobenius. So let B' = B mod 2" for 
some a, then if (B'~ 1 ) (T F{^)B' is integral we are done. We will show that a = 0(1) suffices, and 
as a consequence the algorithm of [4] allows us to compute B' in time 0{n 2 ) and space 0(n 2 ). 

From the valuation bound —2 on B^ 1 above we see that ord(detP) is bounded by 4, and 
hence working modulo 2 6 suffices already to be able to invert B' (which has to be done to the full 
precision 2 N+W ). As soon as the 2-adic precision of B' exceeds 8 bits, the sum of the valuation 
bounds for ^(7) and B'^ 1 , the resulting product (B'~ 1 )' 7 F(-f)B' will be an integral matrix of 
Frobenius as required. Hence taking a := 9 = 0(1) suffices. The loss in precision in this product 
is at most 2 + 6 + 2, hence we have to compute ^(7) modulo 2 N+1Q . 



5 An eigenvalue of the qth power Frobenius 



In this section we will first show that it suffices to compute an approximation of an eigenvalue of 
the matrix of the qth power Frobenius, and we reduce this to computing an 'eigenvalue' of F(j), 
in fact an eigenvalue of the a-linear Frobenius map F p . In a second subsection we explain how 
to solve this last problem, by showing that we can always satisfy certain conditions required for 
an algorithm that computes solutions of a specific type of p-adic equation. 



5.1 Reduction to an 'eigenvalue' of F^) 

Suppose that E is a nonsupersingular curve over ¥ q , where q — p n , and F — F(-y) is the matrix 
of the pth power Frobenius on its Monsky-Washnitzer cohomology over Z g , as explained in the 
two previous sections. For 

f.-p^- 1 . f<?"~ 2 ■ ■ ■ F 17 ■ F, 

the matrix of the gth power Frobenius, Kedlaya [T3] and Denef and Vercauteren [3] showed that 
we have, with Z(T) the zeta function of E over F 9 , 

Z(T)= dct d-^) 



(l-T)(l-qTY 

If we write qT 2 — tT + 1 for the numerator of the zeta function, it follows immediately that 
det(.F) = q and Trf^) = t. Let Ai and A2 be the eigenvalues of J 7 , then we will prove in 
the next subsection that Ai,A2 € Z 9 , and that we may suppose that ord(Ai) = and hence 
ord(A2) = ord(q/Ai) = n. We are trying to compute t = Tr(.F) = Ai + q/X\. The Hasse-Weil 
bound saysd that \t\ < 2^/q, hence we only need to compute Ai modulo p N with 

N := [log p (4^)l = \n/2 + log p (4)l = 0(n), (5) 

which is smaller than n if n is not too small. To conclude, if suffices to compute Ai modulo p N 
in order to find the zeta function of E: the trace t is then the unique integer congruent to A 
modulo p N that satisfies \t\ < 2^/q. 

If we have matrices C and D over Z 9 such that F — C a DC~ 1 with D in uppertriangular 
form, this implies 

T = C* [d^' 1 ■ D a "~ 2 ■ ■ ■ D a ■ dY C'- 1 , 

and with fj, the upper diagonal element of D this gives that the norm A/q /q (^) is an eigenvalue 
of T . We will show in section 15.21 that such \i with valuation can always be found efficiently if 
E is not supersingular. It is easily seen that a factorization F = C a DC^ 1 over Q q cannot exist 
if the curve is supersingular: the product of the two diagonal elements has valuation one, and 
their sum has then valuation at least one. This is clearly impossible as the valuation is a map 
from Q q to the integers. 

Having found fi we still have to compute its norm. For this we can apply an algorithm by 
Harley, which uses an adaptation of Moenck's extended GCD algorithm in order to compute a 
certain resultant. Indeed, if Z q — T, p [x]/tp(x) with ip{x) a Teichmiiller modulus, and fi(x) £ 
Z p [x]/<f(x), then 

Mj,/Q p (m) = Res x {n(x),tp(x)). 

A complete description of the algorithm has been given by Vercauteren and can be found in 
[2"4l Section 3.10.3]. It requires 0(n 2 ) time and 0(n 2 ) space. As noted there, in order for the 
algorithm to work well n(x) should have a unit in Z p as leading coefficient. This is however easily 



2 In fact, the Hasse-Weil bound shows that \t\ < but equality can only occur for supersingular curves. 
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forced: suppose n{x) mod p has degree n — 1 — r, then x r n(x) satisfies this condition. Moreover, 
x r itself satisfies the condition as well, hence computing Af(fi) — N \x r [i{x)) / 'N (x 7 ') gives the 
required result. Note that x r is a Teichmiiller lift with J\f(x r ) = f(0) r , and its norm can thus be 
computed much faster. 



5.2 Computation of an 'eigenvalue' \i of F(j) 

In this subsection = will always mean 'congruence modulo p\ unless 'modp^' is explicitly writ- 
ten. We will need the following algorithm of Harley, which can be found as algorithm 3.10.2 in 
[2j. Note that this algorithm requires Z q to be given as 7L v \x\ modulo a Teichmiiller modulus. 
INPUT : ip(X, Y) G 1 q [X, Y] , x a G Z q such that 

dip dip 
ip(x ,x%) - Q^{x ,x^) = 0, Qy(xo,x%) ^ 0, 

OUTPUT : aeZ q such that 

a = xo, i>(a, a a ) = mod p N . 

Following the complexity estimates found in [2], it is easily shown that if the degree of ip is fixed, 
the algorithm runs in time 0(nN) and space (D(nN). 

Write F — F(j) as ( 1 2 J with all fi in Z 9 , and consider the system of equations 



/3 fi 

fi M fl\ f 1\ \ fi+af 2 =fi 



(6) 

h h) W ^ V"° 7 ' [f3 + ah = m a - 

It is clear that if we can find a solution (a, /i) G Z 9 x Z* mod p N for ([5]), this yields a factorization 
of F = (C r7 )DC~ 1 , which is of the kind that we are looking for. Here C and D equal 



Eliminating p from the equations ([6]) gives 

a(a°f 2 - U) + (a^fr - / 3 ) = 0. (7) 

If fi = f 2 = 0, certainly one of /3, /4 will not be zero modulo p, as ord(det(F)) = 1. In this case 
we can work with (a 1) T instead of (1 a) T . So we can suppose that at least one of /i or f 2 is 
nonzero modulo p. Let 

mod p J , or := mod p 

h J V/i 

If both definitions make sense, det(i^) = implies that they are equal. Computing the corre- 
sponding xq is easy finite field arithmetic. We define the polynomial i{j(X,Y) by 

1>{X,Y) := X(Yf 2 - U) + (Y/j - / 3 ) G Z q [X,Y]. 

Our choice of x% guarantees that ip(xo, Xq) = and also 

o 

— 1p(x , Xq) = Xlh - fi = 0. 

This last inequality holds even if /2 = 0. We will show immediately that -^^(xo, Xq) ^ follows 
from nonsupersingularity. The algorithm from the beginning of this section allows us now to 
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compute aeZ ? and hence \i with precision N = 0(n) in time 0(n 2 ) and 0(n 2 ). In addition, 
eliminating a from ([5]) yields 

M/4 - «"/ 2 ) = A/4 - /2/3, 

which equals det(F) and has valuation 1. As — a a fa = 0, it is impossible that ord(/i) > as 
well. 

Suppose that -^^(xq^Xq) = x f 2 + fi = 0. If / 2 = this would imply fx = 0, which we 
excluded. Define = fi/ f 2 , then 

fi — ~~~ x 0j fi = x o = x o> /a = /1/4 = — :c o + • 

As a consequence 

This implies that the trace of T is congruent to zero modulo p, and hence the curve considered 
is supersingular. 

6 Conclusion and implementation results 

Combining all steps explained in sections [2j [3j [4] and [5] above, we have found a deterministic 
algorithm that for every elliptic curve over ¥ p n given by its Weierstrass equation, can compute 
its zeta function in time 0(n 2 ) and space 0(n 2 ). We will now give a list of the main steps of the 
algorithm. We assume that we are working in odd characteristic, and with an 'integral basis' for 
the Monsky-Washnitzer cohomology HJ {W . We do not mention in the algorithm that we only 
compute approximations of the objects involved. If p n is so small that N > n in (|5|), we can use 
a naive point counting algorithm. 

INPUT: Finite field F p «, monic squarefree polynomial Q(X) G F p «[X] of degree 3, 
OUTPUT: The zeta function of the elliptic curve Y 2 = Q(X) over F p n. 

STEP 1 : Put the curve in a one parameter family Y 2 = Q(X, 7), where 7 G F p « , as explained in 
section [2j 

STEP 2: Compute the matrix of Frobenius F(0) of Y 2 — Q(X,0), and the differential equation 
for F(r). 

STEP 3: Solve the differential equation and find F(T) S Z p [[r]] 2x2 . 

STEP 4: Determine (p{x) G F p [izi], the minimal polynomial of 7, and define F p ™ := ¥ p [x]/(p(x). 
Lift (p{x) to a Teichmiiller modulus so that Z p m = Z p [x]/(p(x) and x = 7. 

STEP 5: Compute ^(7) by reducing F(T) modulo ^(r). 

STEP 6 : Compute a solution (a, /1) with ord(/^) = for the equation 



STEP 7: Compute t\ = Afq m /q p (n) modulo an appropriate power of p, such that |ii| < l^pp?' 
Compute then the resultant 



n T 2 _ tT+l = Res x (p" l A 2 - t x X + 1, X n/m - T). 
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STEP 8: Return 



p n T 2 _ tT+ l 

(1 - T){l-p n T)' 



We have implemented this algorithm in odd characteristic, and present a few timing results 
obtained with it. Note that we do not use Harley's 0(n 2 ) norm algorithm for step 7, but instead 
the — far easier to implement and in practice probably faster for reasonable n — algorithm of 
Satoh, Skjernaa and Taguchi [21j . This method runs in time 0(n 2 5 ) given some precomputations. 
These precomputations require time 0(n 3 ), but are completely integer arithmetic and hence 
extremely fast. In our algorithm they are necessarily part of the algorithm (they depend on 
<p(x), the minimal polynomial of the parameter 7), so our implementation has as theoretical 
complexity C(n 3 ). In step 2 the matrix -F(O) is computed using an implementation of Kedlaya's 
algorithm by Michael Harrison. 

The implementation has been made in the computational algebra system Magma V2.13-3, 
and the timing results were obtained on an AMD Athlon 64 3000+ , using 1GB of physical mem- 
ory. The algorithm received as input a random elliptic curve over F p n , given by its Weierstrass 
equation. All times in the following table are in seconds. 



p\n 


50 


100 


250 


500 


1000 


2000 


4000 


3 


.35 


.78 


4.47 


18.43 


99 


604 


4293 


5 


1.18 


2.73 


11.43 


47.92 


227 


1389 




7 


3.27 


7.79 


41.85 


186.48 


957 


5592 





It is interesting to see that for n 3> almost all computation time goes to steps 6, 7 and the 
computation of the Teichmuller modulus in step 4, the first two being comparable in required 
time. E.g. for p n = 3 4000 we have as total time 4293 seconds, where step 6 uses 1916 seconds 
and step 7 uses 2190 seconds. For p n = 7 2000 the computation of ip(x) takes 3910 seconds. 
A conclusion that could be drawn from this is that for such big fields our algorithm should 
work faster than Harley's — as long as in either algorithm the same norm algorithm and no 
precomputation is used — because he needs a computation similar to step 6 but with an equation 
ip of higher degree, and exactly the same field polynomial and norm computation. 

Steps 2 and 3 can be considered as precomputation, meaning that they only depend on the 
field size (and the structure of the family in which the curve lives). For n big enough these 
steps are of minor influence, but for fields of cryptographic size it is worth looking at the time 
needed for just one curve. The following table gives these times for the field sizes as above, hence 
ignoring the time for steps 2 and 3 of the algorithm. 



p\n 


50 


100 


250 


500 


1000 


2000 


4000 


3 


.15 


.46 


3.65 


16.58 


95 


592 


4252 


5 


.26 


.92 


6.77 


36.55 


198 


1306 




7 


1.76 


4.87 


34.06 


167.56 


909 


5447 
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